Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(object store): introduce new s3 object store via OpenDAL #14409

Merged
merged 15 commits into from
Feb 20, 2024

Conversation

wcy-fdu
Copy link
Contributor

@wcy-fdu wcy-fdu commented Jan 8, 2024

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

detail in #14321

Checklist

- [ ] I have written necessary rustdoc comments
- [ ] I have added necessary unit tests and integration tests
- [ ] I have added test labels as necessary. See details.
- [ ] I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features #7934).
- [ ] My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).

  • All checks passed in ./risedev check (or alias, ./risedev c)
    - [ ] My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)

- [ ] My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

- [ ] My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

license-eye has totally checked 4697 files.

Valid Invalid Ignored Fixed
2049 1 2647 0
Click to see the invalid file list
  • src/object_store/src/object/opendal_engine/opendal_s3.rs

src/object_store/src/object/opendal_engine/opendal_s3.rs Outdated Show resolved Hide resolved
@wcy-fdu
Copy link
Contributor Author

wcy-fdu commented Jan 19, 2024

I have verified the correctness of OpenDAL s3.

  • before(via aws sdk): we need to set credential in environment by aws configure or just set access_key in pod env, then sdk will call from_env() to get the credential.
  • For new OpenDAL S3: we need to set AWS_REGION, AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, and for those S3 compatible storage, we need to set one more RW_S3_ENDPOINT. This is the same as current docker compose env.

Next, we need to do perf test and stability test for the new OpenDAL S3, but before that we need to apply this change in RisingWave Operator and Kube Bench. I have created an issue for this, and will start testing after this is done. So let's merge this PR first to allow other components to be easily tested. @hzxa21

The next PR will switch current minio to use OpenDAL S3.

@wcy-fdu wcy-fdu requested review from hzxa21 and zwang28 January 19, 2024 05:58
@kwannoel kwannoel self-requested a review January 23, 2024 08:38
@hzxa21 hzxa21 requested a review from MrCroxx January 23, 2024 11:30
@wcy-fdu
Copy link
Contributor Author

wcy-fdu commented Jan 30, 2024

@wcy-fdu wcy-fdu requested a review from a team as a code owner February 1, 2024 09:16
@wcy-fdu
Copy link
Contributor Author

wcy-fdu commented Feb 1, 2024

Add some enhancement for opendal s3:

  • add concurrent write(v0.44.2)
  • Set the retry parameters the same as the current s3
  • add trust-dns feature(by bump version to 0.44.2)

Copy link

gitguardian bot commented Feb 1, 2024

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secret in your pull request
GitGuardian id GitGuardian status Secret Commit Filename
9425213 Triggered Generic Password 50bc801 ci/scripts/regress-test.sh View secret
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secret safely. Learn here the best practices.
  3. Revoke and rotate this secret.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

Our GitHub checks need improvements? Share your feedbacks!

Copy link
Collaborator

@hzxa21 hzxa21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that the commit is messed up with a merged PR. Let's resolve it first.

Also, I don't see OpendalObjectStore::new_s3_engine being used. I thought we are going to use some flag to control the switch.

@wcy-fdu
Copy link
Contributor Author

wcy-fdu commented Feb 1, 2024

It seems that the commit is messed up with a merged PR. Let's resolve it first.

Umm... never mind, I am build RisingWave image right now, which need to be consistent with nightly0201, so I cherry pick #14844.
Will rebase main after https://buildkite.com/risingwavelabs/docker/builds/15123 is ready(blocked now)

src/object_store/src/object/mod.rs Outdated Show resolved Hide resolved
src/object_store/src/object/mod.rs Outdated Show resolved Hide resolved
src/object_store/src/object/opendal_engine/opendal_s3.rs Outdated Show resolved Hide resolved
…nto wcy/opendal_s3

consistent with nightly-0207 for perf test.
@wcy-fdu
Copy link
Contributor Author

wcy-fdu commented Feb 8, 2024

@wcy-fdu
Copy link
Contributor Author

wcy-fdu commented Feb 8, 2024

Take eyes on q104:

grafana url

opendal-s3:https://grafana.test.risingwave-cloud.xyz/d/EpkBw5W4k/risingwave-dev-dashboard?orgId=1&var-da[…]ar-namespace=nexmark-kafka-benchmark-20240207-124259
aws sdk https://grafana.test.risingwave-cloud.xyz/d/EpkBw5W4k/risingwave-dev-dashboard?orgId=1&var-da[…]74000&var-namespace=nexmark-bs-15-105-daily-20240207

metrics

  • streaming upload start avg
    • opendal-s3
      image
    • nightly-0207
      image

Before introducing concurrent uploading for opendal-s3, this metrics opendal side will be 100 times slower, but now both sides are basically the same.

  • streaming upload finish avg
    • opendal-s3
      image
    • nightly-0207
      image

Also streaming upload finish avg are basically the same now.

  • read p99
    • opendal-s3
      image
    • nightly-0207
      image

Previously, for this metrics(read p99/pmax), there are a lot of burrs in opendal side, we find that it's because the retry strategy is different. After turning retry config to be consistent with s3-sdk, this metrics are alse basically the same now.


Above are three abnormal metrics found in last time benchmark, and now it seems that all have been fixed or optimized. @Xuanwo

Abnormal metrics found this time

  • upload avg meta
    • opendal-s3
      image
    • nightly-0207
      image

@Xuanwo
Copy link
Contributor

Xuanwo commented Feb 8, 2024

What does upload avg meta measure?

@wcy-fdu
Copy link
Contributor Author

wcy-fdu commented Feb 8, 2024

What does upload avg meta measure?

checkpointing

@wcy-fdu wcy-fdu requested a review from Li0k February 8, 2024 08:51
@wcy-fdu wcy-fdu enabled auto-merge February 18, 2024 05:27
@Xuanwo
Copy link
Contributor

Xuanwo commented Feb 18, 2024

Hi, @wcy-fdu, do we have a final report about this? Is there anything that opendal could improve?

@wcy-fdu
Copy link
Contributor Author

wcy-fdu commented Feb 18, 2024

Hi, @wcy-fdu, do we have a final report about this? Is there anything that opendal could improve?

We will go ahead do some perf test on main branch, comparing with everyday's nightly.
And still investigate the abnormal metricsupload avg meta, which indicate the checkpoint latency.

@Xuanwo
Copy link
Contributor

Xuanwo commented Feb 18, 2024

And still investigate the abnormal metricsupload avg meta, which indicate the checkpoint latency.

Got it, let's figure out it.

@wcy-fdu wcy-fdu added this pull request to the merge queue Feb 20, 2024
Merged via the queue into main with commit 4197ad5 Feb 20, 2024
34 checks passed
@wcy-fdu wcy-fdu deleted the wcy/opendal_s3 branch February 20, 2024 06:44
github-actions bot pushed a commit that referenced this pull request Feb 23, 2024
lmatz pushed a commit that referenced this pull request Feb 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants